[1] This paper describes a new forecasting tool developed for and currently being tested by NASA's
Space  Radiation  Analysis  Group  (SRAG)  at  Johnson  Space  Center,  which  is  responsible  for  the 
monitoring  and  forecasting  of  radiation  exposure  levels  of  astronauts.  The  new  software  tool  is  designed 
for  the  empirical  forecasting  of  M-  and  X-class  flares,  coronal  mass  ejections,  and  solar  energetic  particle 
events.  For  each  type  of  event,  the  algorithm  is  based  on  the  empirical  relationship  between  the 
event  rate  and  a  proxy  of  the  active  region's  free  magnetic  energy.  Each  empirical  relationship  is 
determined from a data set of ~40,000 active-region magnetograms from ~1300 active regions observed
by  SOHO/Michelson  Doppler  Imager  (MDI)  that  have  known  histories  of  flare,  coronal  mass  ejection, 
and  solar  energetic  particle  event  production.  The  new  tool  automatically  extracts  each  strong-field 
magnetic  area  from  an  MDI  full-disk  magnetogram,  identifies  each  as  a  NOAA  active  region,  and 
measures  the  proxy  of  the  active  region's  free  magnetic  energy  from  the  extracted  magnetogram.  For 
each  active  region,  the  empirical  relationship  is  then  used  to  convert  the  free-magnetic-energy  proxy 
into  an  expected  event  rate.  The  expected  event  rate  in  turn  can  be  readily  converted  into  the  probability 
that  the  active  region  will  produce  such  an  event  in  a  given  forward  time  window.  Descriptions  of 
the  data  sets,  algorithm,  and  software  in  addition  to  sample  applications  and  a  validation  test  are 
presented.  Further  development  and  transition  of  the  new  tool  in  anticipation  of  SDO/HMI  are  briefly 
discussed.

1.  Introduction

[2] NASA's Space Radiation Analysis Group (SRAG) at
Johnson Space Center (JSC) is responsible for monitoring
and forecasting radiation levels for astronauts. Solar particle
event (SPE) forecasting is critical because SPEs can
result in large, sudden, and unexpected increases in the
radiation levels that astronauts experience while conducting
space walks (Neal Zapp, Space Weather Week, 2010; available at
http://helios.swpc.noaa.gov/sww/2010/wednesday/ZAPP%20SWW%202010.ppt).
SRAG needs both the capability of forecasting that the
necessary conditions for an SPE will not exist (all-clear
forecast) and the capability of forecasting the probability
that an SPE will occur. According to Reames [1999], SPEs
come in two basic types: impulsive $^{3}$He-rich events
produced by flares, and more gradual SPEs produced by the
shock fronts of fast coronal mass ejections (CMEs). For the
latter, the shock front is broad, and a large part of the
heliosphere can be showered in particles, resulting in only
a weak longitudinal dependence of the number and
strength of SPEs observed at Earth from western hemisphere
source regions. The former, in contrast, eject particles
into a smaller portion of the heliosphere, and the
source regions of impulsive SPEs observed at Earth tend
to be located near where the magnetic field lines that
connect the Earth to the Sun originate, ~60 degrees west
[Reames, 1999, Figure 2.3]. Some multispacecraft observations
indicate a wider injection spread [Wiedenbeck
et al., 2010; Wibberenz and Cane, 2006], but the distribution
of source regions for impulsive SPEs observed at Earth
still peaks in the western hemisphere. In any case, the first
step in forecasting an SPE is to forecast its drivers: major
flares (X and M class) and CMEs, especially fast CMEs.
Forecasting X and M flares and CMEs would be useful for
other space weather forecasters as well as for SRAG. The
next step in SPE forecasting is to predict the magnitude of
the resulting SPE at Earth from the type, magnitude, and
heliographic location of the driver.

[3] It would be preferable to use a physics-driven model
for  forecasting  an  active  region's  probability  of  producing 
a  major  flare  or  CME.  Because  there  is  no  agreement  on 
how  CMEs  are  triggered  or  powered,  we  instead  use  an 
empirical  approach.  From  the  observed  performance  of 
previous similar active regions and the measured condition
of an active region's magnetic field, we determine the
probability  or  expected  rate  of  occurrence  of  X-class  flares, 
X-  and  M-class  flares,  CMEs,  fast  CMEs,  and  SPEs  to  be 
produced  by  the  active  region.  As  such,  the  present  tool  is 
only  the  first  step  for  SPE  forecasting. 

[4]  Present  major  flare  and  CME  forecasting  techniques 
used by NOAA rely on the McIntosh active region classification
scheme [McIntosh, 1990]. The technique works
by having an observer classify an active region by sunspot
presence, sunspot size, leading spot penumbra development,
and sunspot distribution. There are 60 different
active  region  classifications  that  can  be  assigned,  and 
there  are  empirically  determined  event  rates  for  each 
category.  Since  some  of  the  active  region  classes  are  rare, 
the  statistics  of  the  empirical  rates  for  these  classes  tend  to 
be  poor. 

[5]  It  is  well  known  that  active  regions  that  display 
obvious  magnetic  nonpotentiality  (or  stored  free  magnetic 
energy)  are  much  more  productive  of  CMEs  and  flares 
than are active regions that show little or no nonpotentiality
[e.g., Zirin and Liggett, 1982; Zirin, 1988; Canfield
et al., 1999]. This makes forecasting based on the amount
of  energy  stored  in  the  coronal  magnetic  field  a  reasonable 
approach.  This  is  the  scientific  basis  of  our  empirical 
forecasting  technique.  We  do  not  assume  that  flare  and 
CME  rates  depend  only  on  the  free  magnetic  energy,  but 
do  assume  that  these  rates  should  be  correlated  with  the 
free  magnetic  energy.  It  is  likely  that  the  production  rates 
depend  on  the  free  magnetic  energy  in  combination  with 
other  important  parameters.  On  one  hand,  a  technique 
based solely on the free magnetic energy cannot distinguish
active regions of different event productivity that
have  the  same  free  magnetic  energy.  On  the  other  hand, 
because  the  free  magnetic  energy  is  known  to  be  one  of 
the  dominant  determinants,  we  should  expect  a  strong 
positive  correlation  between  free  magnetic  energy  and 
event  rate  without  having  to  specifically  account  for  the 
other  conditions. 

[6]  Ideally,  a  direct  measure  of  an  active  region's  free 
magnetic energy in the corona should be used for forecasting
purposes. Using the virial theorem, Low [1982]
showed  how  this  energy  can  be  measured  from  an  ideal 
vector  magnetogram  from  a  level  of  the  active  region  at 
and  above  which  the  magnetic  field  is  force  free.  However, 
this approach cannot yet be used because all routinely
provided vector magnetograms currently available are too
inaccurate, or are of the non-force-free photosphere, or
both [e.g., Gary et al., 1987; Klimchuk et al., 1992; Schrijver
et al., 2008]. Instead, we can use proxies, or indirect measures,
of the free magnetic energy, which we expect to be
well correlated with the free magnetic energy but with no
expectation of a linear relationship between the two. One
proxy of sheared or nonpotential magnetic fields is the sigmoid
[Canfield et al., 1999], an S or inverse-S shaped
coronal X-ray feature. Sigmoids are not quantitative
measures  since  a  sigmoid  is  either  evident  or  not,  and 
sometimes  they  become  evident  only  during  the  CME 
event;  so  they  are  of  limited  use  for  forecasting.  Another 
proxy  is  the  presence  of  a  delta  sunspot  in  the  active 
region.  A  delta  sunspot  contains  two  opposite  polarity 
umbras  that  share  the  same  penumbra.  This  is  partly  the 
basis of the McIntosh classification scheme, but, as with sigmoids,
the measure is binary: an active region either does or does not have a
delta  sunspot.  Most  magnetic  proxies  of  the  free  magnetic 
energy  require  a  vector  magnetogram  (a  magnetogram 
that  maps  both  the  line-of-sight  field  and  the  transverse 
field). There are several related proxy free-energy measures
that are based on having strong gradients in the
vertical  component  of  the  field  across  the  neutral  lines  (the 
lines  that  separate  the  positive  and  negative  polarities  of 
an  active  region),  and  hence  can  be  applied  to  line-of-sight 
magnetograms [Falconer, 2001; Falconer et al., 2002, 2003,
2006, 2008, 2009; Jing et al., 2006; Georgoulis and Rust, 2007;
Schrijver, 2007]. For some unknown reason, when the Sun
produces  strong  vertical-field  gradients  across  a  neutral 
line,  the  Sun  nearly  always  strongly  shears  the  magnetic 
field along the neutral line. The most extreme cases produce
delta sunspots. Delta spots have, of course, very
strong vertical-field gradients across the neutral line that
separates the opposite polarity umbras. They also have
very strongly sheared field along the neutral line. Falconer
et al. [2008] have shown that the strong-gradient neutral-line
measure used in the present forecasting tool is well
correlated with free-energy proxies measured from the
transverse  field  (either  shear  angle  or  net  current  flowing 
from  one  polarity  to  the  other).  By  this  correlation  our 
strong-gradient  neutral-line  measure  is  also  a  proxy 
measure  of  an  active  region's  free  magnetic  energy.  The 
reason neutral-line gradient measures are of
special interest is the availability of large databases of
consistent, good-cadence magnetograms from space.
The SOHO/Michelson Doppler Imager (MDI) line-of-sight
magnetograph [Scherrer et al., 1995], which has been taking
full  disk  magnetograms  at  a  cadence  of  15  per  day  since 
1996, is a prime example. Determining reliable empirical
relationships between a proxy of free magnetic energy and
an active region's rates of production of either flares or
CMEs requires a large data set. For this study, space-based
observations have several advantages over ground-based
observations. These include 24 h coverage, only one
instrument, and no errors due to seeing. SOHO/MDI is
not the only magnetograph in space; there is also the
vector magnetograph on Hinode. But since Hinode was
launched in 2006 as Cycle 23 was heading toward minimum,
the number of observed active regions is much
smaller.  Also,  the  Hinode  magnetograph's  field  of  view 
covers  only  single  active  regions.  As  a  result  it  does  not 
consistently  observe  every  active  region  on  the  disk,  which 
produces  biases  in  the  data.  There  now  is  SDO/HMI,  a 
full-disk  vector  magnetograph  (http://hmi.stanford.edu/) 
that  will  be  replacing  SOHO/MDI,  but  it  will  be  years 
before  a  database  comparable  to  the  current  one  from 
MDI  is  available.  Since  SOHO/MDI  will  be  replaced,  we 
must  prepare  to  transition  what  we  learn  from  the  SOHO/ 
MDI  database  to  the  better  instrument,  SDO/HMI,  as 
discussed  in  section  6. 

[7]  Using  a  neutral-line  gradient  type  of  free-magnetic- 
energy  proxy  measured  from  line-of-sight  magnetograms 
has  the  following  disadvantage.  The  physical  magnetic 
field  of  interest  is  the  vertical  magnetic  field.  Only  when 
active  regions  are  near  disk  center  is  the  line-of-sight 
magnetic field a good approximation of the vertical magnetic
field. Beyond approximately 30-40 heliocentric
degrees,  fictitious  neutral  lines  can  occur  that  are  due  to 
projection  effects,  and  some  of  these  can  have  large 
apparent  gradients.  For  our  empirical  fitting  purpose,  we 
therefore  limit  our  data  set  to  only  magnetograms  of  active 
regions  observed  within  30  heliocentric  degrees.  Further, 
our  free-magnetic-energy  proxy  was  developed  for 
strong-field  active  regions  where  the  transverse  magnetic 
field could be measured using MSFC vector magnetograms.
Thus our proxy is not designed to determine the
free magnetic energy of large-scale weak-field magnetic
flux concentrations in the quiet Sun; these are old decaying
active regions and can give rise to quiet Sun prominence
eruptions. However, since the most powerful flares/CMEs
typically originate in strong-field active regions,
concentrating on forecasting active region events is
clearly a good starting point.
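
The 30 degree restriction can be illustrated with a short sketch. It assumes that "heliocentric degrees" means the great-circle angle between the region and apparent disk center, and that the region position is available as heliographic latitude and central-meridian longitude; the function names and the default disk-center latitude of zero are hypothetical choices for illustration, not part of the tool described here.

```python
import math

def heliocentric_angle_deg(lat_deg, lon_deg, b0_deg=0.0):
    """Angular distance (degrees) of a point from apparent disk center.

    lat_deg, lon_deg: heliographic latitude and central-meridian longitude.
    b0_deg: heliographic latitude of disk center (assumed 0 if not supplied).
    """
    lat, lon, b0 = (math.radians(v) for v in (lat_deg, lon_deg, b0_deg))
    # Spherical law of cosines between (lat, lon) and disk center (b0, 0).
    cos_theta = (math.sin(b0) * math.sin(lat)
                 + math.cos(b0) * math.cos(lat) * math.cos(lon))
    # Clamp against rounding error before taking the arc cosine.
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_theta))))

def within_central_disk(lat_deg, lon_deg, b0_deg=0.0, limit_deg=30.0):
    """True if the region lies within limit_deg of disk center."""
    return heliocentric_angle_deg(lat_deg, lon_deg, b0_deg) <= limit_deg
```

For example, a region at latitude 20 and longitude 25 degrees is about 32 degrees from disk center under these assumptions, so it would be left out of the fitting data set even though neither coordinate alone exceeds 30.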

2.  Description  of  the  Databases 

[8]  To  develop  and  test  a  forecasting  tool  we  need  to 
determine  the  empirical  rates  as  a  function  of  the  free- 
magnetic-energy  proxy.  To  this  end,  we  need  accurate 
flare/CME/SPE  production  histories  of  a  large  number  of 
active  regions  and  a  time  series  of  each  active  region's 
free-magnetic-energy  proxy.  A  list  of  SPEs  and  their 
sources  (http://umbra.nascom.nasa.gov/SEP/)  has  been 
developed  by  NOAA  and  was  used  for  this  study.  As  long 
as  there  are  full-disk  coronal  images,  the  largest  flares,  X- 
and  M-class  flares,  can  normally  each  be  assigned  to  an 
active  region.  Some  C-class  flares  are  not  associated  with 
active  regions,  and  during  solar  maximum  some  C-class 
flares  might  not  be  detected  because  the  X-ray  background 
of  the  entire  Sun  is  often  mid-C  level.  So  we  limit  our 
forecast  to  X-  and  M-class  flares.  CMEs  are  seen  in  SOHO/ 
LASCO  movies  (http://cdaw.gsfc.nasa.gov/CME_list/).  CMEs 
can  either  be  frontside  or  backside  events.  We  start  with  a 
flare and CME catalog from C. Balch (private communication,
2007) that used NOAA forecasters' daily observations
to identify source regions of flares and CMEs. Compiling
that catalog was a labor-intensive process, and it made this
project possible.
Most  CMEs,  especially  the  more  powerful  ones,  originate 
in  active  regions  along  with  a  major  flare.  From  only 
SOHO/LASCO observations we can determine the portion
of the Sun the CME likely originated from but not whether
the event was a frontside or backside event. In other
words, when a CME is seen to be rapidly
growing  in  width,  emerging  above  the  west  limb  of  the 
LASCO  occulting  disk,  its  source  must  be  near  the  west 
limb  and  not  near  the  east  limb  or  disk  center.  The  source 
region  could  be  an  active  region  that  has  just  rotated  over 
the  west  limb  (backside),  on  the  limb,  or  will  soon  rotate 
around  the  west  limb  but  is  on  the  disk  of  the  Sun 
(frontside).  The  flare  accompanying  a  backside  CME  will 
not  be  seen  by  GOES,  while  either  a  limb  or  a  frontside 
CME will have a GOES signature. It is important to confirm
the source region of a flare since during solar maximum
the Sun can produce many flares, and sometimes a
flare is falsely assigned to a CME. For example, a flare
might occur in an active region near the east limb or disk
center at about the time a west limb CME is seen, and the
CME is then falsely assigned to that active region. Full-disk
coronal images from various instruments (SOHO/EIT,
Yohkoh/SXT, GOES/SXI)
can be used to confirm or refute these assignments. We
double  checked  every  X-  and  M-class  flare  and  CME  that 
was important for our study, i.e., that came from one of
our  active  regions  and  had  occurred  during  the  24  h  after 
the  time  of  one  of  our  magnetograms.  As  such,  we  do  not 
need  to  check  flares  or  CMEs  that  are  assigned  to  one  of 
our  active  regions  but  occurred  more  than  24  h  after  the 
active  region  left  the  central  disk  area  where  we  make  our 
magnetic measurements. We have eliminated most of the
falsely assigned CMEs by finding either that the timing was
wrong (CME seen in LASCO before the flare starts) or that the
CME obviously originated near the solar limb but was
assigned to a central-disk active region flare. Occasionally
a  quiet  Sun  prominence  eruption  (CME)  was  falsely 
associated  with  a  nearby  active  region. 

[9]  The  time  series  of  each  active  region's  free-magnetic- 
energy proxy was determined using an automated algorithm
that extracts from full-disk MDI magnetograms
strong  magnetic  field  areas,  identifies  them  with  NOAA's 
active  regions,  and  then  measures  our  proxy  of  free 
magnetic  energy.  This  automated  capability  is  critical  to 
our  new  forecasting  tool.  We  have  applied  this  algorithm 
to  all  full-disk  MDI  magnetograms;  however,  for  purposes 
of  the  results  and  analysis  presented  in  this  work,  the 
effective  end  date  is  December  2004  which  corresponds  to 
the  end  date  of  Balch's  flare/CME  database.  Our  focus  is 
to  evaluate  flare  and  CME  rates  as  functions  of  only  our 
proxy  of  free  magnetic  energy  of  isolated  active  regions. 
There  are  cases,  however,  when  two  or  more  NOAA  active 
regions  are  included  in  a  single,  extracted  strong-field 
magnetic  area.  For  simplicity,  these  particular  cases  have 
been excluded as they represent only ~15% of the magnetic
islands corresponding to active regions. We also
exclude strong-field magnetic areas that are not NOAA
active regions, since Balch's flare/CME database only
includes events from NOAA active regions. The combined
database  we  have  developed  runs  from  the  date  of  the  first 
MDI  active-region  magnetogram  (10  May  1996)  through 
the  last  entry  of  the  flare/CME  catalog  (24  December 
2004).  The  data  set  consists  of  ~40,000  magnetograms  from 
~1300 active regions with known flare, CME, and SPE
production  histories.  Using  this  large  combined  database, 
we  are  able  to  discern  power  law  dependence  between  our 
proxy  of  the  free  magnetic  energy  of  an  active  region  and 
the  active  region's  flare  rate,  its  CME  rate,  its  fast  CME 
rate,  and  its  SPE  rate.  These  rates  were  incorporated  into 
our  beta  forecasting  tool,  currently  being  tested  by  NASA/ 
SRAG.  We  also  plan  to  apply  these  techniques  to  SDO/ 
HMI  full-disk  magnetograms. 
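
As a rough illustration of how a power-law rate could be turned into a forecast probability, the sketch below evaluates a rate from a proxy value and converts it to the chance of at least one event in a forward window. Two assumptions are made here that are not stated in the text above: the fit coefficients (`amplitude`, `exponent`) are hypothetical placeholders, and the conversion treats events as a Poisson process with a constant rate over the window.

```python
import math

def expected_rate(proxy, amplitude, exponent):
    """Power-law event rate (events per day) for a free-energy proxy value."""
    return amplitude * proxy ** exponent

def event_probability(rate_per_day, window_days=1.0):
    """Probability of at least one event in the window.

    Assumes Poisson statistics with a constant rate (an assumption of this
    sketch, not a statement of the tool's implementation).
    """
    return 1.0 - math.exp(-rate_per_day * window_days)

# Hypothetical fit parameters, for illustration only.
rate = expected_rate(proxy=2.0e4, amplitude=1.0e-6, exponent=1.2)
prob = event_probability(rate, window_days=1.0)
```

Under the Poisson assumption the probability saturates toward 1 for high rates and is approximately equal to the rate times the window length for low rates.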

2.1.  Magnetic  Measurements  Database 

2.1.1.  NOAA  Active  Regions,  Magnetic  Islands, 
and  Magnetic  Measures 

[10] In a previous study [Falconer et al., 2009], we manually
selected a subfield of view of a full-disk MDI magnetogram
that encompassed only the active region of
interest. This selected subfield was shifted from magnetogram
to magnetogram to track solar rotation and thus
develop  a  time  series  of  magnetic  measures  for  each  active 
region.  This  is  a  reasonable  approach  for  a  sample  of 
44  active  regions,  but  applying  this  technique  to  a  large 
sample  would  be  very  time  consuming.  Also,  a  forecasting 
tool  would  need  to  be  able  to  automatically  identify  a 
reasonable  subfield  of  view  to  enclose  one  active  region. 
We  have  developed  an  automated  algorithm  to  use  for 
both  scientific  studies  and  forecasting.  This  allows  us  to 
apply  various  conditions  depending  on  the  quality  of 
the inputs. The inputs are MDI magnetograms and
NOAA active region lists (http://www.swpc.noaa.gov/ftpdir/forecasts/SRS/).
The technique identifies contiguous
sets of pixels with strong magnetic field, which appear
on plots like islands in a sea; so we call them "magnetic
islands." We call them magnetic islands rather than
active regions, since they can contain zero, one, or more
active regions. Our free-energy proxy is tailored to active
regions,  which  are  the  predominant  source  of  flares  and 
CMEs,  so  we  need  to  remove  from  the  list  those  magnetic 
islands  that  are  not  active  regions  as  described  below.  A 
subfield  of  view  of  the  full-disk  magnetogram  is  needed  to 
evaluate  our  magnetic  measures.  This  is  accomplished 
using  a  polygon  such  that  all  magnetogram  pixels 
enclosed  by  the  polygon  are  used  to  evaluate  our  magnetic 
measures.  The  addition  of  a  portion  of  the  quiet  sun  has 
negligible  effect  on  our  magnetic  measures,  but  the 
inclusion  of  a  portion  of  another  active  region  could  lead 
to  an  overestimation  of  an  active  region's  free  magnetic 
energy.  Therefore,  any  portion  of  another  magnetic  island 
is  excluded  leaving  each  magnetic  island  enclosed  in  a 
polygon  that  encloses  one,  and  only  one  island.  Some 
magnetic  islands  include  two  or  more  NOAA  active 
regions;  they  are  excluded  from  the  present  study  because 
of  complications  they  pose.  We  have  plans  for  future 
studies to refine the tool to include them. At present the
forecast tool treats each such island as one active region when
evaluating the magnetic measures and forecasting event rates,
using the conversion function obtained from isolated active regions.

2.1.2.  Magnetic  Island  Identification  Algorithm 

[11] To identify active regions the following algorithm is
used.  We  mask  the  limb  (>0.95  Rs)  to  avoid  limb  effects. 
We  then  smooth  the  logarithm  of  the  magnitude  of  the 
line-of-sight  magnetic  field  with  a  Gaussian  smoother 
[Gonzalez and Woods, 1992] that has a full width at half maximum
of  12  pixels.  We  apply  a  25  G  threshold  to  the  5  min 
average  MDI  magnetograms  and  35  G  to  the  noisier  1  min 
average  MDI  magnetograms.  This  process  leaves  a  large 
number  of  strong  field  islands  including  active  regions, 
plage,  and  ephemeral  active  regions.  Narrower  Gaussian 
smoothers  were  tried  initially,  but  often  active  regions 
would  be  divided  into  two  separate  parts.  We  increased 
the  width  so  that  we  would  not  divide  active  regions  into 
two  different  islands.  As  a  consequence  the  number 
of  islands  having  multiple  NOAA  active  regions  was 
increased. 

[12] To eliminate magnetic islands that are not sunspot
active regions because they are too small or the field is too
weak, we keep only islands that fulfill two conditions:
(1) the island has a maximum line-of-sight field greater
than 750 G and (2) the island has an area of over 50 MDI
pixels (~200 arcsec$^{2}$). The numerical values of all these
thresholds and parameters were empirically determined
and can be modified by the forecaster, but are used consistently
to develop the database. All islands that meet
these  requirements  were  numbered. 
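
The selection step can be sketched as follows. The sketch assumes the smoothing and thresholding have already produced a Boolean mask of strong-field pixels, and it assumes 4-connectivity when grouping pixels into islands; the connectivity rule and the function name `select_islands` are illustrative choices, since the text does not specify them.

```python
from collections import deque

def select_islands(mask, b_los, min_peak_gauss=750.0, min_area_pixels=50):
    """Label 4-connected regions of `mask` and keep the strong, large ones.

    mask:  2-D list of bools, True where the smoothed field exceeds threshold.
    b_los: 2-D list of line-of-sight field values (G), same shape as mask.
    Returns a list of islands, each a list of (row, col) pixels.
    """
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    islands = []
    for r0 in range(rows):
        for c0 in range(cols):
            if not mask[r0][c0] or seen[r0][c0]:
                continue
            # Breadth-first flood fill collects one contiguous island.
            queue, pixels = deque([(r0, c0)]), []
            seen[r0][c0] = True
            while queue:
                r, c = queue.popleft()
                pixels.append((r, c))
                for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
                    if (0 <= nr < rows and 0 <= nc < cols
                            and mask[nr][nc] and not seen[nr][nc]):
                        seen[nr][nc] = True
                        queue.append((nr, nc))
            # Condition 1: peak |B_los| above 750 G; condition 2: area over 50 pixels.
            peak = max(abs(b_los[r][c]) for r, c in pixels)
            if peak > min_peak_gauss and len(pixels) > min_area_pixels:
                islands.append(pixels)
    return islands
```

Both thresholds are keyword arguments so that, as in the text, a forecaster could change them while the database itself is built with one consistent setting.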

2.1.3.  Enclosing  Magnetic  Islands  With  Polygons 

[13]  Next,  we  enclose  each  island  with  a  polygon  in 
which  all  pixels  are  measured  in  evaluating  the  magnetic 
measures.  The  initial  polygons  are  rectangles  that  barely 
enclose the islands, and are recorded as a list of five vertices
(the bottom right corner is both the first and last
vertex  of  the  list).  For  cases  where  two  polygons  overlap, 
the  overlapping  area  is  subdivided  so  that  each  polygon 
encloses  one  and  only  one  magnetic  island.  The  key 
requirement  of  this  process  is  that  neither  polygon 
encloses  any  portion  of  the  other  magnetic  island  as  this 
would  affect  our  magnetic  measures.  This  is  done  by 
changing  the  vertices  list.  The  code  first  determines  if 
either  magnetic  island  extends  into  the  overlapping  area. 
If neither magnetic island extends into the area, the first
magnetic island examined has its polygon modified. If one
magnetic island extends into the area, then the other
polygon's vertices are modified so as to exclude the overlapping
area. If both islands extend into the overlapping area, then
both vertex lists are modified by adding the vertices that
describe the "coastline" of the second island. This process
is repeated for all overlapping areas.
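
The first part of this procedure, building the initial rectangles and deciding which of the three overlap cases applies, might look like the sketch below. The vertex order after the bottom right corner, the convention that y increases upward, and all function names are assumptions made for illustration; the vertex-list surgery for each case is not reproduced.

```python
def bounding_rectangle(pixels):
    """Return the closed five-vertex rectangle that barely encloses an island.

    pixels: iterable of (x, y) positions. Vertex order (an assumption):
    bottom right, top right, top left, bottom left, bottom right,
    with y increasing upward.
    """
    xs = [x for x, _ in pixels]
    ys = [y for _, y in pixels]
    x0, x1, y0, y1 = min(xs), max(xs), min(ys), max(ys)
    return [(x1, y0), (x1, y1), (x0, y1), (x0, y0), (x1, y0)]

def overlap_area(rect_a, rect_b):
    """Return (x0, x1, y0, y1) of the area shared by two rectangles, or None."""
    def extent(rect):
        xs = [x for x, _ in rect]
        ys = [y for _, y in rect]
        return min(xs), max(xs), min(ys), max(ys)
    ax0, ax1, ay0, ay1 = extent(rect_a)
    bx0, bx1, by0, by1 = extent(rect_b)
    x0, x1 = max(ax0, bx0), min(ax1, bx1)
    y0, y1 = max(ay0, by0), min(ay1, by1)
    return (x0, x1, y0, y1) if x0 <= x1 and y0 <= y1 else None

def islands_in_area(area, islands):
    """Indices of islands with at least one pixel inside the area."""
    x0, x1, y0, y1 = area
    return [i for i, pix in enumerate(islands)
            if any(x0 <= x <= x1 and y0 <= y <= y1 for x, y in pix)]
```

The length of the list returned by `islands_in_area` (0, 1, or 2 for a pair of islands) selects which of the three modifications described above would be applied.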

2.1.4.  Assigning  NOAA  Active  Region  Numbers 
to  Magnetic  Islands 

[14]  The  NOAA  active-region  list  for  the  day  is  then 
used  to  assign  active  region  numbers.  This  is  done  in  two 
steps. First, if the location that NOAA gives for the active
region falls inside one of the magnetic islands' polygons,
that island is assigned the region's number, or numbers in
the case of multiple active regions. Second, if the NOAA-reported
location of an active region falls outside, but near, a magnetic
island without an assigned NOAA active region
number, that island is assigned the number. The second case is most
common for small active regions, since NOAA's locations
are given in whole heliographic degrees.
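
A minimal sketch of the two-step assignment is given below. It assumes polygons and NOAA locations are expressed in the same planar coordinates, uses a standard ray-casting test for "inside," and interprets "near" as a distance to the polygon boundary below a hypothetical tolerance `near_limit`; the text does not state how nearness is measured, so that part is purely illustrative.

```python
import math

def point_in_polygon(x, y, vertices):
    """Ray-casting test; `vertices` is a closed list (first vertex repeated last)."""
    inside = False
    for (x1, y1), (x2, y2) in zip(vertices, vertices[1:]):
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def distance_to_polygon(x, y, vertices):
    """Shortest distance from a point to the polygon boundary."""
    best = math.inf
    for (x1, y1), (x2, y2) in zip(vertices, vertices[1:]):
        dx, dy = x2 - x1, y2 - y1
        seg2 = dx * dx + dy * dy
        t = 0.0 if seg2 == 0 else ((x - x1) * dx + (y - y1) * dy) / seg2
        t = max(0.0, min(1.0, t))  # clamp to the segment
        best = min(best, math.hypot(x - (x1 + t * dx), y - (y1 + t * dy)))
    return best

def assign_noaa_numbers(polygons, regions, near_limit):
    """polygons: {island_id: vertices}; regions: {noaa_number: (x, y)}.

    Returns {island_id: [noaa_numbers]}.
    """
    assigned = {island: [] for island in polygons}
    leftovers = []
    # Step 1: the reported location falls inside an island's polygon.
    for number, (x, y) in regions.items():
        hits = [i for i, v in polygons.items() if point_in_polygon(x, y, v)]
        if hits:
            assigned[hits[0]].append(number)
        else:
            leftovers.append((number, x, y))
    # Step 2: the location is near an island that still has no number.
    for number, x, y in leftovers:
        free = [(distance_to_polygon(x, y, v), i)
                for i, v in polygons.items() if not assigned[i]]
        if free:
            dist, island = min(free)
            if dist <= near_limit:
                assigned[island].append(number)
    return assigned
```

Islands that end up with an empty list correspond to strong-field areas that are not NOAA active regions, and islands with more than one number correspond to the multiple-region case.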

2.1.5.  Evaluating  Magnetic  Measures  and  Proxies

[15] For each magnetic island, each magnetic measure is
evaluated.  The  proxy  of  free  magnetic  energy  takes 
advantage  of  the  observation  that  nonpotential  or  sheared 
magnetic  field  tends  to  build  up  along  magnetic  neutral 
lines, and magnetic neutral lines that vector magnetograms
observe to have strong gradients across them in the
vertical  magnetic  field  nearly  always  have  strongly 
sheared  horizontal  field  along  them.  By  integrating  the 
gradients  of  the  vertical  magnetic  field  along  the  neutral 
lines  we  obtain  the  weighted  length  of  the  strong-gradient 
neutral line (denoted by $^{L}WL_{SG}$). To evaluate it with a
line-of-sight  magnetogram  (Figure  1),  we  use  the  line-of- 
sight  approximation  treating  the  line-of-sight  field  as  if  it 
were  the  vertical  field  and  limit  the  analysis  to  active 
regions  within  30  heliocentric  degrees  of  disk  center. 
Because the horizontal field cannot be measured with a line-of-sight
magnetogram, we instead use the transverse potential
magnetic field extrapolated from the line-of-sight magnetic
field to limit our neutral lines to those with
strong horizontal magnetic fields. This is done as
described by Falconer et al. [2006, 2008], where the magnetic
measure $^{L}WL_{SG}$ is defined as

$$^{L}WL_{SG} = \int |\nabla_{\perp} B_{los}| \, dl, \qquad (1)$$

where $\nabla_{\perp} B_{los}$ is the transverse gradient of the line-of-sight
magnetic field, and the integral is taken over all neutral-line
increments $dl$ (Figure 1) on which the potential
transverse field computed from the magnetogram is
>150 G. The above is applied only to that part of the MDI
magnetogram  that  is  enclosed  by  the  polygon  that 
encloses  the  magnetic  island.  This  integral  is  evaluated 
numerically  by  dividing  the  neutral  line  into  multiple 
increments,  each  roughly  a  pixel  in  length.  For  each 
increment  we  determine  the  potential  transverse  field  and 
transverse  gradient  at  the  midpoint  of  the  increment  by 
interpolation from the values for each pixel of the magnetogram.
For those increments with potential field larger
than  150  G  the  product  of  the  increment's  length  and 
gradient  is  summed  over  all  strong-field  neutral  lines. 
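
The numerical sum just described can be written compactly. The sketch below assumes the neutral-line increments have already been traced and that, for each one, the length, the interpolated transverse gradient, and the interpolated potential transverse field are available as a tuple; that data layout and the function name are hypothetical.

```python
def weighted_neutral_line_length(increments, min_transverse_gauss=150.0):
    """Sum gradient x length over strong-field neutral-line increments.

    increments: iterable of (length, gradient, potential_transverse_field)
    tuples, each evaluated at the increment's midpoint.
    """
    return sum(length * abs(gradient)
               for length, gradient, pb_t in increments
               if pb_t > min_transverse_gauss)
```

With lengths in pixels and gradients in G per pixel, the result is in gauss; increments whose potential transverse field is at or below the 150 G limit contribute nothing.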

[16] Other magnetic measures are also determined at
this stage. Two other important magnetic measures used
in this paper are (1) the magnetic area, $A_m$, defined as

$$A_m = \int_{|B_{los}| > 100\,\mathrm{G}} da, \qquad (2)$$

where $|B_{los}|$ is the strength of the line-of-sight magnetic
field and the integral is taken over all areas of the magnetogram
for which $|B_{los}| > 100$ G, and (2) the length of the
strong-field neutral line, $^{L}L_{S}$, defined as

$$^{L}L_{S} = \int_{pB_t > 150\,\mathrm{G}} dl, \qquad (3)$$

where the integral is taken over all neutral-line increments
$dl$ on which the potential transverse field, $pB_t$,
computed from the magnetogram is >150 G. An example
of an MDI active-region magnetogram and its strong-field
intervals of neutral lines is shown in Figure 1.
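
Both of these measures reduce to thresholded sums on a pixel grid. The following sketch assumes the island's pixels are given as a 2-D list of line-of-sight field values and the neutral-line increments as (length, potential transverse field) pairs; the function names, the data layout, and the uniform `pixel_area` are assumptions for illustration.

```python
def magnetic_area(b_los, pixel_area=1.0, min_field_gauss=100.0):
    """Area of all pixels whose |B_los| exceeds the threshold."""
    count = sum(1 for row in b_los for b in row if abs(b) > min_field_gauss)
    return pixel_area * count

def strong_field_neutral_line_length(increments, min_transverse_gauss=150.0):
    """Total length of neutral-line increments with strong potential transverse field.

    increments: iterable of (length, potential_transverse_field) pairs.
    """
    return sum(length for length, pb_t in increments
               if pb_t > min_transverse_gauss)
```

Passing the pixel area in physical units (for example arcsec squared per pixel) gives the magnetic area in those units.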

[17] For each magnetic island, the vertices of the polygon
used, the associated NOAA active region number,
and  the  magnetic  measures  are  included  in  our  database 
for  analysis.  The  NOAA  active  region  number  is  used  only 
to  associate  flares,  CMEs,  and  SPEs  for  obtaining  our 
forecasting  curves  (section  2.2);  this  step  is  not  needed  for 
the  forecasting.  Currently,  this  database  extends  from  May 
1996  through  the  present,  but  due  to  the  event  catalog 
ending  in  December  2004  (section  2.2),  in  this  paper  we 
use  only  data  for  May  1996  through  December  2004. 

2.2.  Flare/CME/SPE  Database 

[18] NOAA has both a flare/CME list (C. Balch, private
communication, 2007) and an SPE list (http://umbra.nascom.nasa.gov/SEP/).
The flare/CME database runs through
December  2004,  and  the  SPE  database  had  its  last  entry  in 
2006  (the  last  recorded  SPE  of  Cycle  23).  The  flare  list  has 
assigned  individual  flares  to  particular  NOAA  active 
regions,  and  for  each  flare  it  lists  whether  it  occurred 
together with a CME. The magnitude and start time of the
flare as well as the speed of the accompanying CME are
recorded in our database. The SPE list has active region
assignment, start time of flare, magnitude of associated flare,
and magnitude of SPE for >10 MeV protons with flux above
10 pfu (1 pfu = 1 particle/cm$^{2}$/s/sr). We cross reference these
two  databases  with  our  MDI  magnetic  island  list,  using 
active  region  numbers,  to  compile,  for  each  active  region,  an 
event  history  as  well  as  a  magnetic  measures  history  to 
produce  the  combined  data  set,  which  is  the  basis  of  our 
forecasting  curves. 
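
The cross-referencing can be pictured as a join on active region number followed by a time-window test. The sketch below assumes each magnetogram record carries a NOAA number, an observation time, and a proxy value, and each event record a NOAA number and a start time; these record layouts and the function name are hypothetical, while the 24 h forward window follows the description in section 2.

```python
from datetime import datetime, timedelta

def label_magnetograms(magnetograms, events, window=timedelta(hours=24)):
    """Flag each magnetogram with whether its region produced an event soon after.

    magnetograms: list of (noaa_number, observation_time, proxy_value).
    events:       list of (noaa_number, start_time).
    Returns a list of (noaa_number, observation_time, proxy_value, had_event).
    """
    # Group event start times by active region number for quick lookup.
    starts_by_region = {}
    for number, start in events:
        starts_by_region.setdefault(number, []).append(start)
    labeled = []
    for number, obs_time, proxy in magnetograms:
        had_event = any(obs_time <= start < obs_time + window
                        for start in starts_by_region.get(number, []))
        labeled.append((number, obs_time, proxy, had_event))
    return labeled
```

Binning the labeled records by proxy value and counting the flagged fraction in each bin is one way such a combined data set could yield an empirical rate as a function of the proxy.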

[19]  We  confirmed  the  association  of  each  flare,  CME 
and  SPE  in  the  database  that  could  affect  our  results  (i.e., 
that  occurred  in  an  active  region  within  24  h  of  a  central 
disk magnetogram). This was done by checking LASCO,
SXT, EIT, and SXI movies, and GOES timing found at
http://sxi.ngdc.noaa.gov/ and
http://cdaw.gsfc.nasa.gov/CME_list/. The most common correction to the database
came from finding that a CME that originated from just
beyond the limb (backside event) had been falsely associated
with a flare occurring in an active region near disk
center.

3.  Forecasting  Technique 

[20]  To  determine  the  expected  empirical  event  rate  as  a 
function  of  our  free-magnetic-energy  proxy  we  use  the 
isolated  active  regions  in  the  combined  database  (MDI 
Figure 1. An active-region line-of-sight magnetogram (AR 9077, 14 July 2000) from MDI and the
strong-field neutral lines from which our proxy of the active region's free magnetic energy is evaluated.
An active region is composed of strong positive (white) and negative (black) magnetic field
concentrations. Separating the positive and negative field are magnetic neutral lines, at which magnetic
field of opposite polarity can cancel. The strong-field intervals of the neutral lines are colored
red (see text).


magnetic  measures  combined  with  NOAA  event  history 
as  described  in  section  2).  We  use  only  magnetic  islands 
with  only  one  assigned  NOAA  active  region,  and  only 
when  the  active  region  is  within  30  heliocentric  degrees  of 
disk  center.  These  two  conditions  are  applied  since  we 
need  to  avoid  cases  where  our  measurements  sum  over 
more  than  one  active  region,  and  we  want  to  include  only 
cases  where  projection  effects  are  acceptably  small 
[Falconer et al., 2008]. Our magnetic measure, ^{L}WL_{SG}, is
designed to indirectly measure the free magnetic energy
of strong-field active regions, that is, active regions in
which enough of the neutral line has potential transverse
field that is strong (>150 G). This measure is not a good
proxy  of  the  free  magnetic  energy  of  old  decaying  active 
regions  that  have  lost  their  sunspots.  To  exclude  these 
decaying  active  regions  we  require  that  the  length  of  the 
strong-field  neutral  line,  Ls,  divided  by  the  square  root  of 
the  magnetic  area  of  the  active  region  is  greater  than  0.75. 
Our set of magnetic islands (now measured active regions)
that fulfill these conditions consists of 39,977 magnetograms
from 1329 active regions observed between 10 May
1996 and 25 December 2004, a period that spans part of
the solar cycle 22-23 minimum phase and the maximum
phase of solar cycle 23.
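
The selection conditions described above can be sketched as a simple filter. This is only an illustration: the record type `IslandMeasurement`, its field names, and the function `passes_selection` are hypothetical and are not taken from the tool itself. The sketch assumes the neutral-line length and the magnetic area are given in consistent units, so that their ratio is dimensionless.

```python
import math
from dataclasses import dataclass


@dataclass
class IslandMeasurement:
    """Hypothetical record for one measured magnetic island."""
    n_assigned_regions: int        # NOAA active regions assigned to the island
    angle_from_center_deg: float   # heliocentric angle from disk center
    strong_nl_length: float        # length of the strong-field neutral line, L_S
    magnetic_area: float           # magnetic area, in (length unit of L_S) squared


def passes_selection(m, max_angle_deg=30.0, min_ratio=0.75):
    """Return True if the island qualifies for the forecasting-curve sample."""
    # Exactly one assigned NOAA active region (isolated active region).
    if m.n_assigned_regions != 1:
        return False
    # Within 30 heliocentric degrees of disk center (small projection effects).
    if m.angle_from_center_deg > max_angle_deg:
        return False
    # Guard against a non-physical area before taking the square root.
    if m.magnetic_area <= 0:
        return False
    # Exclude decaying regions: L_S / sqrt(area) must exceed 0.75.
    return m.strong_nl_length / math.sqrt(m.magnetic_area) > min_ratio
```

Applying such a filter to every magnetogram in the combined database would yield the sample used for the fits; the thresholds are kept as parameters so that the effect of relaxing them can be explored.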

[21]  To  determine  the  dependence  of  an  event  rate  on 
our  proxy  of  free  magnetic  energy,  where  event  rate  can 
mean  X  flare  rate,  X  and  M  flare  rate,  CME  rate,  fast-CME 
rate  (fast  CMEs  are  CMEs  with  plane-of-sky  velocity  of 
greater than 800 km/s), or SPE rate, we bin our sample in
bins of increasing ^{L}WL_{SG}. For each bin we determine the
average ^{L}WL_{SG} and the number of CMEs, X and M flares, fast
CMEs, and SPEs that occur during the 24 h period after
the time of the active-region magnetogram. The number
of  counted  events  divided  by  the  number  of  active-region 
magnetograms  in  a  bin  is  the  24  h  event  rate  for  that  bin. 
Using  Poisson  statistics  [Sachs,  1978],  we  then  determine, 
for  each  bin  and  event  type,  the  1-sigma  uncertainty  in  the 
event  rate. 
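
A minimal sketch of this binning step follows. The function name `binned_event_rates` and the input layout are hypothetical. The uncertainty uses the simple sqrt(N) Poisson approximation, which is an assumption made here for brevity and is not necessarily the estimator taken from Sachs [1978]; in particular it returns zero for bins with no events, where a proper upper limit would be nonzero.

```python
import math


def binned_event_rates(samples, n_bins):
    """Bin (proxy, events_in_next_24h) pairs into equally populated bins.

    Returns a list of (mean_proxy, rate_per_24h, sigma_rate) tuples,
    ordered by increasing proxy value.
    """
    ordered = sorted(samples, key=lambda s: s[0])
    n = len(ordered)
    results = []
    for i in range(n_bins):
        # Integer slicing keeps bin populations equal to within one sample.
        chunk = ordered[i * n // n_bins:(i + 1) * n // n_bins]
        if not chunk:
            continue
        count = sum(events for _, events in chunk)
        mean_proxy = sum(proxy for proxy, _ in chunk) / len(chunk)
        rate = count / len(chunk)                # events per magnetogram per 24 h
        sigma = math.sqrt(count) / len(chunk)    # assumed sqrt(N) approximation
        results.append((mean_proxy, rate, sigma))
    return results
```

Each sample pairs one active-region magnetogram's proxy value with the number of events of a given type in the following 24 h, so the function is run once per event type.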

[22] We divided our data into 40 equally populated bins
(~1000 active-region magnetograms per bin) for reasonable
statistics per bin. For each kind of event the average
^{L}WL_{SG}, event rate, and uncertainty of the event rate are
log-log plotted in Figure 2. At this point, to convert a new
active region's measured ^{L}WL_{SG} to the active region's
Figure 2. Log-log plots of event rates versus the free-magnetic-energy proxy. Each bin value
(asterisks), its rate uncertainty (I), and the power law fits (red lines) are shown. Blue dashed lines
mark the 0.01 events per 24 h threshold used in the fits (see text).


forecasted event rate, we could use a lookup table,
assigning the active region the expected event rate for the
bin that its ^{L}WL_{SG} value falls in. For an active region in this
range of ^{L}WL_{SG} the expected event rates are determined
from our database. We rejected this technique since the
event rates fluctuate from bin to bin and do not increase
monotonically. This would result in ranges of
^{L}WL_{SG} where the predicted rate decreases as ^{L}WL_{SG}
increases. Instead, and since we do have an
approximate power law relationship (Figure 2), we
decided to fit the data with a power law and estimate the
forecasted rate based on the measured ^{L}WL_{SG}, its uncertainty,
the power law fit, and the uncertainty in the power
law fit. The power law fit, for each type of event, is
determined only from those bins with event rates greater than 0.01
events per 24 h (>10 events per 1000 magnetograms). Only
bins with rates greater than 0.01 are used because for bins
with no events the upper limit of the 1-sigma uncertainty
is just over 0.015/d. The fit is in the form

R = a (^{L}WL_{SG})^{b}.  (4)

[23]  The  fit  parameters,  their  uncertainties,  and  the  fit's 
reduced  chi-square  value  are  given  in  Table  1.  The  best  fit 
is  for  major  flares,  with  the  fits  for  the  other  event  types 
having  larger  reduced  chi-square  values  and  typically 
larger uncertainties. The uncertainty in a is minimized
by dividing ^{L}WL_{SG} by 50,000 G so that the log-y intercept
is near the centroid of the fitted bins. Relative to
the uncertainties in measured values of ^{L}WL_{SG}, the
uncertainties  in  the  fitting  parameters  dominate  the 
uncertainty  in  the  forecast  event  rate  and  are  symmetrical 
in  log-space,  resulting  in  a  multiplicative  factor  of  the  rate 
(e.g.,  for  a  1-sigma  uncertainty  of  a  factor  of  2,  the  1-sigma 
range  extends  from  twice  the  rate  given  to  half  the  rate 
given). 
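
The fit of equation (4) can be sketched as a weighted straight-line fit in log-log space. The function `fit_power_law` is hypothetical, and the weighting (inverse variance of log10 R, propagated from the rate uncertainty) is an assumption; the text does not specify the fitting routine. The 50,000 G pivot and the 0.01 events per 24 h cut follow the description above.

```python
import math


def fit_power_law(bins, pivot=50_000.0, min_rate=0.01):
    """Fit log10(R) = log10(a) + b * log10(proxy / pivot).

    bins: iterable of (mean_proxy, rate, sigma_rate).
    Returns (log10_a_at_pivot, b). Bins at or below min_rate, or with no
    usable uncertainty, are skipped.
    """
    xs, ys, ws = [], [], []
    for proxy, rate, sigma in bins:
        if rate <= min_rate or sigma <= 0:
            continue
        xs.append(math.log10(proxy / pivot))
        ys.append(math.log10(rate))
        # Propagate the rate uncertainty into log10 space (assumed weighting).
        sigma_log = sigma / (rate * math.log(10))
        ws.append(1.0 / sigma_log ** 2)
    if len(xs) < 2:
        raise ValueError("need at least two bins above the rate threshold")
    sw = sum(ws)
    xbar = sum(w * x for w, x in zip(ws, xs)) / sw
    ybar = sum(w * y for w, y in zip(ws, ys)) / sw
    sxx = sum(w * (x - xbar) ** 2 for w, x in zip(ws, xs))
    sxy = sum(w * (x - xbar) * (y - ybar) for w, x, y in zip(ws, xs, ys))
    b = sxy / sxx
    return ybar - b * xbar, b
```

Dividing the proxy by a pivot near the middle of the fitted range decorrelates the intercept from the slope, which is why the intercept uncertainty is smallest there. The intercept for the proxy in gauss follows as log10 a = log10_a_at_pivot - b * log10(pivot).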

[24] We can choose the value of ^{L}WL_{SG} below which we
make an all-clear forecast. This all-clear ceiling value of
^{L}WL_{SG} also determines the fraction of the sample in the
all-clear range of ^{L}WL_{SG}. Figure 3 shows an example all-clear
ceiling of 0.05 events per 24 h for each type of event.
Table 2 lists the percent of the sample below the ceiling,
together with the rate of the bin of greatest ^{L}WL_{SG}. For
this example, we use a threshold rate of 0.05 events per
day for the chance of an event to be considered nonnegligible.
The semilog scale of Figure 3 is used to
emphasize that only for a small number of active regions
(large ^{L}WL_{SG}) is the probability of an event nonnegligible.
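
As a worked illustration, the all-clear ceiling can be obtained by inverting equation (4) for a chosen threshold rate. The sketch below uses the X and M flare parameters of Table 1 and assumes that the tabulated log10 a applies to the proxy expressed in gauss; the magnitudes in Tables 1 and 2 suggest this reading, but the text does not state it. The function names are hypothetical.

```python
import math


def forecast_rate(proxy_gauss, log10_a, b):
    """Equation (4): R = a * proxy**b, evaluated in log space."""
    return 10.0 ** (log10_a + b * math.log10(proxy_gauss))


def all_clear_ceiling(threshold_rate, log10_a, b):
    """Invert equation (4): the proxy value at which R equals threshold_rate."""
    return 10.0 ** ((math.log10(threshold_rate) - log10_a) / b)


# X and M flare parameters from Table 1 (assumed to apply to the proxy in G).
LOG10_A_MAJOR_FLARES = -9.75
B_MAJOR_FLARES = 1.95

ceiling = all_clear_ceiling(0.05, LOG10_A_MAJOR_FLARES, B_MAJOR_FLARES)
```

Under these assumptions the ceiling for major flares comes out a little above 2 x 10^4 G; active regions with a smaller proxy value would receive an all-clear forecast for that event type.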

4.  Validating  the  Active-Region  Forecast 

[25] As a preliminary test of the validity of our forecasting
method, we have divided our sample into two
groups separated chronologically; all the observations on
a given day or before belong to the first group and all the
observations after that day belong to the second group.
From the first group we have determined the fitting
parameters "a" and "b" of equation (4). With these fitting
Table 1. Parameters for the Power Law Fits

Event Type        log10 a           b              Reduced Chi-Square Value
X and M flares    -9.75 ± 0.03      1.95 ± 0.14    0.21
X flares          -10.77 ± 0.11     2.00 ± 0.58    0.39
CMEs              -7.81 ± 0.04      1.50 ± 0.16    0.31
Fast CMEs         -8.36 ± 0.07      1.55 ± 0.29    0.43
SPEs              -8.84 ± 0.12      1.57 ± 0.59    0.44

parameters we forecast the expected event rates of major
flares for the second group. Major flares were selected as
the event type for this preliminary test because they are
the most numerous type of event and hence have
the best statistics.

[26]  The  second  group  is  then  binned  in  10  equally 
populated  bins  based  on  these  forecast  rates,  and  for  each 
bin  the  average  forecasted  rate  is  compared  to  the  average 
actual rate. Figure 4 shows such a comparison for
a dividing date of 30 June 2002, which puts roughly
70% of the active-region magnetograms in the first group
and 30% in the second. Similar results were obtained
using other dividing dates. This shows that our forecasting
method works quite well. The error bars are calculated
with  Poisson  statistics  to  estimate  the  likely  range  of  the 
actual  event  rate.  Similar  to  the  fitting  procedure  of  the 
forecast  curves  in  Figures  2  and  3,  in  Figure  4  only  the  five 
bins  with  forecasted  rates  greater  than  0.01/d  are  plotted. 


Table 2. All Clear Fraction and Maximum Rates for Different Event Types

Event Type        Percent All Clear    Max Rate
X and M flares    77%                  0.87
X flares          95%                  0.15
CMEs              78%                  0.43
Fast CMEs         89%                  0.21
SPEs              97%                  0.09

For  the  highest  forecasted  rate  bin,  the  error  bars  are 
logarithmically  small  due  to  the  large  number  of  M  and  X 
flares.  In  any  particular  bin  there  are  active  regions  whose 
rates are larger or smaller due to other factors. We expect
to improve the forecast by identifying and using secondary
forecast measures, as we allude to in the discussion.
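
The validation procedure can be sketched as follows. The record layout (dictionaries with `day`, `proxy`, and `events` keys) and both function names are hypothetical; `forecast` stands for any function that maps a proxy value to a forecast rate, for example equation (4) with parameters fitted on the first group.

```python
def split_chronologically(records, split_day):
    """First group: observations on or before split_day; second group: after."""
    first = [r for r in records if r["day"] <= split_day]
    second = [r for r in records if r["day"] > split_day]
    return first, second


def compare_forecast_to_actual(second_group, forecast, n_bins=10):
    """Bin the second group by forecast rate and compare with actual rates.

    Returns a list of (mean_forecast_rate, actual_rate) per bin, ordered by
    increasing forecast rate. Bins are equally populated to within one sample.
    """
    scored = sorted(
        ((forecast(r["proxy"]), r["events"]) for r in second_group),
        key=lambda pair: pair[0],
    )
    n = len(scored)
    comparison = []
    for i in range(n_bins):
        chunk = scored[i * n // n_bins:(i + 1) * n // n_bins]
        if not chunk:
            continue
        mean_forecast = sum(f for f, _ in chunk) / len(chunk)
        actual = sum(e for _, e in chunk) / len(chunk)
        comparison.append((mean_forecast, actual))
    return comparison
```

Points that fall near the diagonal of a forecast-versus-actual plot indicate that parameters fitted on the earlier data carry over to the later data.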


5.  Full-Disk  Forecasting  Tool 

[27]  For  forecasting  an  expected  event  rate  or  probability 
of  event  for  the  whole  face  of  the  Sun,  we  need  to  forecast 
the  expected  rate  of  each  active  region  on  the  solar  disk. 
The selection limitations we placed on our combined data
set to determine empirical event rates, namely, within
30 heliocentric degrees and only one active region per
strong-field magnetic island (see section 3), are dropped.
These restrictions were needed to derive the most accurate
parameters for power law fits (Table 1), but the active


Figure 3. Same as Figure 2 but in linear-log space, with ^{L}WL_{SG} (G) on the horizontal axis. Blue horizontal lines depict the 5% threshold
(see text), and the blue vertical lines are the corresponding threshold values of ^{L}WL_{SG}.
Figure 4. Log-log plot of the actual rates versus the
expected rates for major flares (see text for details).
The actual rates equal the expected rate on the diagonal
line.


regions beyond 30 degrees cannot be ignored for full-disk
forecasting. Ignoring them would result in an underestimation
of the actual full-disk event rate. The forecasts for
these are more uncertain since fictitious neutral lines
(neutral lines in the line-of-sight field but not in the real
vertical magnetic field) will cause occasional erroneously
large ^{L}WL_{SG}, and thus erroneously large forecasted rates.
Users are thus warned when the active region is beyond
30 heliocentric degrees and the measured values of ^{L}WL_{SG}
are suspect. They are also cautioned in the event that a
strong magnetic field area with multiple active regions is
being treated as one active region. This last change,
however, is likely to have small effects since typically
either most active regions in the group have little free
energy, and so the group has a small ^{L}WL_{SG} and thus a
negligible overall expected event rate, or one active region
in the group dominates ^{L}WL_{SG} as well as the event rate.
Only when there are two or more active regions with
comparable and moderate to large ^{L}WL_{SG} will we overestimate
the probability of an event. Further, in the case of
two nonpotential active regions that are close and likely
connected by magnetic loops, the actual event rate might
differ from the case of two active regions that are well
isolated from each other. We plan to determine in future
research whether their event rates change but for now
assume there is no effect.

[28]  Sample  results  of  the  present  forecast  tool  are 
shown  in  Figure  5  for  29  October  2003.  Shown  on  the  top 
left  of  Figure  5  is  the  name  of  the  MDI  magnetogram  and 
all  NOAA  active  regions  listed  for  the  day.  On  the  top 
right,  the  date  and  time  of  the  magnetogram  are  listed. 


The  table  on  the  bottom  lists  the  results  for  each  magnetic 
island  with  NOAA  active  region  and  full  disk  estimated 
event  rates  and  probability  of  events  along  with  their 
uncertainties. Each strong-field magnetic island is identified
and enclosed in a polygon. In the center, a full-disk
MDI magnetogram with line-of-sight field (scaled
between ±250 G) is shown, on which the center of the disk
(red plus sign) and the 30° radius central disk (red circle),
where the magnetic measures are most accurate, are
plotted. NOAA active regions (the reported heliographic
location  of  each  active  region  is  marked  with  a  red  asterisk) 
are  assigned  to  appropriate  strong-field  magnetic  islands 
as  described  in  section  3.  In  this  particular  case,  a  NOAA 
active  region  on  the  west  limb  is  unassigned,  and  four 
active  regions  are  assigned  to  magnetic  island  1.  (Note  that 
the  enclosing  polygon  in  this  case  is  an  example  of  where 
the  polygon  is  not  a  rectangle  due  to  the  small  magnetic 
island  number  3.)  The  magnetic  measures  are  determined 
for  all  strong-field  magnetic  islands  with  at  least  one 
assigned  NOAA  active  region  number.  We  color  code  each 
magnetic  island  using  the  color  scheme  (green,  yellow, 
and  red),  with  thresholds  at  0.01  and  0.1  major  flares  a  day, 
to  indicate  the  level  of  risk  forecast.  Strong  field  magnetic 
islands  without  NOAA  active  regions  assigned  are  colored 
pink  if  within  30  heliocentric  degrees  of  disk  center  and 
blue  if  outside.  For  cases  of  two  or  more  active  regions 
assigned  to  the  same  polygon,  a  plus  sign  is  added  to 
the  active  region  number  with  the  largest  size.  For  active 
regions  beyond  30  degrees,  an  exclamation  point  is  added 
to  indicate  that  the  active  region  is  outside  the  30°  radius 
central  disk,  and  the  predicted  rates  should  be  used  with 
extra  caution. 
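
The color coding described above amounts to a small decision rule, sketched here. The function name `island_color` is hypothetical, and the treatment of values exactly at a threshold (0.01, 0.1, or 30 degrees) is an assumption, since the text does not say which side each boundary belongs to.

```python
def island_color(n_assigned_regions, angle_from_center_deg,
                 major_flare_rate_per_day=None):
    """Return the display color for one strong-field magnetic island."""
    if n_assigned_regions == 0:
        # No NOAA active region assigned: color by location only.
        return "pink" if angle_from_center_deg <= 30.0 else "blue"
    if major_flare_rate_per_day is None:
        raise ValueError("a forecast rate is required for assigned islands")
    if major_flare_rate_per_day < 0.01:
        return "green"
    if major_flare_rate_per_day < 0.1:
        return "yellow"
    return "red"
```

For example, an assigned island forecast at 0.05 major flares per day would be drawn in yellow, and an unassigned island 50 degrees from disk center in blue.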

[29] Besides giving forecasts for each individual strong
magnetic island we also give a forecast for the entire disk.
This is done by summing the individual magnetic
islands' rates. Normally, the event rate of one magnetic
island dominates the full-disk rate. For this reason, for each
kind of event, the multiplicative uncertainty assigned to the
disk forecast is the multiplicative uncertainty of the highest
forecasted rate. All event rates are given only
to one significant digit.
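
A minimal sketch of the full-disk combination follows, assuming each island forecast is available as a (rate, multiplicative uncertainty) pair. The function names and the input layout are hypothetical.

```python
def disk_forecast(island_forecasts):
    """Combine per-island forecasts into a full-disk forecast.

    island_forecasts: list of (rate_per_day, multiplicative_uncertainty).
    Returns (disk_rate, disk_multiplicative_uncertainty).
    """
    if not island_forecasts:
        return 0.0, 1.0
    # The disk rate is the sum of the individual island rates.
    total = sum(rate for rate, _ in island_forecasts)
    # The disk uncertainty is taken from the island with the highest rate.
    _, factor = max(island_forecasts, key=lambda f: f[0])
    return total, factor


def one_significant_digit(x):
    """Round a non-negative rate to one significant digit for reporting."""
    if x == 0:
        return 0.0
    return float(f"{x:.0e}")
```

Taking the uncertainty from the dominant island is a reasonable shortcut only when one island contributes most of the summed rate; with several comparable islands a fuller error propagation would be needed.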

[30]  These  rates  can  then  be  converted  into  all-clear 
event  probabilities  as  functions  of  the  length  of  time  t  of 
the  forecast  interval  using  the  following  relation: 

Prob(t) = 100% e^{-Rt},  (5)

where Prob(t) is the absolute probability of having no
events in time t, and R is the event rate [Wheatland, 2001;
Moon et al., 2001]. Note that while R can be greater than
1 per day, Prob(t) will only asymptotically approach 0% as
R  gets  large.  Reporting  the  expected  event  rate  has  an 
advantage  over  reporting  only  the  event  probability; 
unlike  the  probability  measure,  for  a  given  rate  R  the 
number  of  events  increases  linearly  with  the  length  of  the 
time  interval.  The  disk  all-clear  probabilities  are  listed 
with  uncertainties  in  the  last  two  rows,  and  are  shown 
graphically  on  the  "threat  gauge,"  to  the  right  of  the  MDI 
Figure 5. Sample output of the new forecast tool for 29 October 2003 (an extremely active day of
Cycle 23).


magnetogram,  with  red  showing  the  chance  of  an  event, 
green  the  chance  of  no  event,  and  yellow  showing  the 
range of uncertainty in the all-clear probability. The
uncertainty in the all-clear probabilities tends to be large
relative to the all-clear probability or the event probability
(100% minus the all-clear probability), whichever is
smaller, and is also larger for X-class flares and SPEs
compared to M and X flares or CMEs due to poorer
statistics (Figure 5). The forecasted event probabilities
span several decades but most are actually negligibly
small. The particular day shown in Figure 5 falls during
the passage of the Halloween 2003 active regions, one of
the most eventful times of the last cycle, and is
not a typical day.
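
The conversion from an event rate to an all-clear probability, together with the range implied by a multiplicative rate uncertainty, can be sketched as below. The relation is the Poisson no-event probability, 100% times exp(-R t); the function names are hypothetical, and applying the uncertainty factor directly to the rate is an assumption about how the plotted range is obtained.

```python
import math


def all_clear_probability(rate_per_day, window_days):
    """Percent probability of no event in the forecast window."""
    return 100.0 * math.exp(-rate_per_day * window_days)


def all_clear_range(rate_per_day, window_days, factor):
    """All-clear probabilities at rate*factor (low) and rate/factor (high)."""
    low = all_clear_probability(rate_per_day * factor, window_days)
    high = all_clear_probability(rate_per_day / factor, window_days)
    return low, high


def event_probability(rate_per_day, window_days):
    """Percent probability of at least one event in the forecast window."""
    return 100.0 - all_clear_probability(rate_per_day, window_days)
```

Because the rate enters through an exponential, a symmetric multiplicative uncertainty in the rate produces an asymmetric range in probability, which is one reason the rate itself is also reported.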

6.  Discussion 

[31] We have presented a description of a new forecasting
tool developed for and currently being tested by
NASA's  Space  Radiation  Analysis  Group  at  JSC,  which  is 


responsible  for  monitoring  and  forecasting  of  radiation 
exposure levels of astronauts. The new empirical forecasting
tool is based on a proxy of an active region's free
magnetic energy that can be measured from a line-of-sight
magnetogram and that has strong predictive ability
for the rates of an active region's production of M and X
flares, CMEs, fast CMEs (>800 km/s), and solar particle
events. The tool uses the empirically determined power
law relationship between our proxy of active-region free
magnetic energy and the event rate. The tool is automated:
it can take any full-disk MDI magnetogram, isolate strong-field
areas, identify the strong-field areas with NOAA
active  regions,  extract  magnetic  measures,  make  forecasts 
for  individual  magnetic  islands  as  well  as  for  the  full  Sun, 
save  an  entry  for  a  database,  and  output  a  forecast  plot 
(Figure  5).  This  forecast  tool  is  the  first  quantitative  tool 
based  on  a  magnetic  measure  delivered  to  a  space  weather 
forecasting  organization  (NASA/SRAG)  for  potential 
operational use. In contrast, the McIntosh [1990] active-region
forecast scheme, used by NOAA, is based on
60 prescribed active-region categories; the category
assigned to an active region is determined by human
inspection of photospheric images of the active region.

[32] The present algorithm assumes that no other
parameter than our free-energy proxy affects an active
region's event rate. It is likely that active-region event
rates do depend on other parameters. We plan to incorporate
secondary measures (e.g., previous flare activity)
that, when properly combined with ^{L}WL_{SG}, will give a
more accurate forecast than using ^{L}WL_{SG} alone. We
believe our large data set will allow such investigations in
the future, and thus allow future improvements to our
tool.

[33] For SPE forecasting, the algorithm can be improved
by the addition of more physics. The empirical SPE rate is
found, presently, for active regions observed within
30 heliocentric degrees of disk center. Work by
Reames [1999] and others has shown a dependence of SPE
occurrence at Earth on the longitude of the source of the
driving eruption, which our algorithm does not include.
Most SPEs at Earth come from active regions in western
longitudes. With MDI line-of-sight magnetograms the
error in the measurement of ^{L}WL_{SG} from active regions
beyond 30 degrees from disk center could easily swamp
the longitudinal dependence, but with HMI vector magnetograms
the longitudinal dependence of the rate of
production of SPEs observed at Earth by active regions
can be taken into account. Also, coupling the forecast
flare/CME rate for an active region with heliospheric
models should, in principle, result in an improved forecast
of the chance of an SPE at Earth.

[34] The new forecasting technique, however, has two
weaknesses: the lack of magnetic observations of active
regions on or behind the limb, and the fact that no attempt
at forecasting quiet-region prominence eruptions has been
made. The lack of limb observations can be partially
addressed for west limb events, which are more likely to
produce SPEs than east limb events [Balch, 2008] (26 west
limb, 2 east limb, out of a sample of 165 SPEs), by using
the last (furthest west) good evaluation for longer forecasts.
Forecasting for active regions on the east limb
would need to use forecaster estimates based on STEREO
observations (http://stereo-ssc.nascom.nasa.gov/), farside
helioseismology, and the recent history of the active region
rotating onto the disk. Placing magnetographs at the
Earth/Sun L4 and L5 points would supply the observations
needed for using this forecasting technique for active
regions near and beyond the east limb. The development
of a forecasting technique for quiet-region prominence
eruptions would improve forecasting of CMEs. The associated
flares, though, are normally weak, and these CMEs
rarely produce SPEs, so forecasts of X and M flares or
SPEs will not be improved significantly.

[35] The tool is presently based on using MDI line-of-sight
magnetograms, but now that SDO with HMI is
launched, we will be able to make the tool better due to


HMI's  advantages  over  MDI.  These  advantages  include 
vector  magnetograms,  higher  resolution,  reduced  latency, 
and  faster  cadence.  The  vector  magnetograms  can  be 
deprojected  to  disk  center  (convert  line-of-sight  and 
transverse  field  to  vertical  and  horizontal  magnetic  fields) 
and  so  remove  the  line-of-sight  approximation,  and  thus 
more  accurately  measure  our  proxy  of  free  magnetic 
energy  in  active  regions  further  from  disk  center.  Also, 
several  proxies  of  free  magnetic  energy  that  are  measured 
from  the  horizontal  magnetic  field  component  will 
be obtainable from the HMI vector magnetograms. To
determine if any of these proxies are better than ^{L}WL_{SG}
and to develop a usefully large database will take many
years of observations. The higher resolution will tend to
result  in  HMI  measuring  a  larger  gradient  along  the 
neutral  line  than  does  MDI  for  the  same  active  region,  at 
the same time. By using either the overlap between MDI
and HMI observations or, if both do not observe enough
active regions, by chaining through either SOLIS or
Hinode vector magnetograms, this effect can be calibrated
out so that the MDI database can be used for HMI magnetograms.
SOHO/MDI is currently operating with
reduced  data  throughput  and  does  not  bring  down  and 
make  calibrated  data  readily  available.  During  the  solar 
minimum  this  is  not  so  critical  since  active  region  driven 
events  are  rare  and  soon  it  will  be  replaced  by  HMI,  which 
will have short latency. Some delay (several hours) is
acceptable since ^{L}WL_{SG} tends to evolve on timescales of a
day or more.